NSF PAR Search | NSF Public Access Repository

Measuring the Uncertainty of Environmental Good Preferences with Bayesian Deep Learning

https://doi.org/10.1145/3524458.3547250

Flores, Ricardo; Tlachac, ML; Rundensteiner, Elke A. (September 2022, 2022 ACM Conference on Information Technology for Social Good)

Due to climate change and the resulting natural disasters, there has been growing interest in measuring the value of social goods to our society, such as environmental conservation. Traditionally, stated preference methods, such as contingent valuation, capture an economic perspective on the value of environmental goods through the willingness to pay (WTP) paradigm. The economic theory underlying the estimation of WTP with machine learning is the random utility model. However, the estimation of WTP depends on rather simple preference assumptions based on a linear functional form. These models are therefore unable to capture the complex uncertainty in the human decision-making process. Further, contingent valuation only uses the mean or median estimate of WTP. Yet it has been recognized that other quantiles of the WTP would be valuable to ensure the provision of social goods. In this work, we propose to leverage Bayesian Deep Learning (BDL) models to capture the uncertainty in stated preference estimation. We focus on the probability of paying for an environmental good and the conditional distribution of WTP. The Bayesian deep learning model connects with the economic theory of the random utility model through the stochastic component of individual preferences. To test our proposed model, we work with both synthetic and real-world data. The results on synthetic data suggest that BDL can capture the uncertainty consistently across different distributions of WTP. For the real-world data, a forest conservation contingent valuation survey, we observed high variability in the distribution of the WTP, suggesting high uncertainty in individual preferences for social goods. Our research can be used to inform environmental policy, including the preservation of natural resources and other social goods.
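For readers unfamiliar with the baseline the abstract contrasts against, the following is a minimal sketch of the traditional linear random utility model with a logistic error term, not the paper's BDL model. The function names and the parameter values `alpha` and `beta` are hypothetical and chosen only for illustration. It shows how the probability of paying a given bid and the quantiles of WTP both follow from the same linear utility.

```python
import math

def prob_yes(bid, alpha, beta):
    """Probability of accepting a bid under a linear logit random utility model.

    Assumes the utility difference is alpha - beta * bid + eps,
    with eps following a standard logistic distribution.
    """
    return 1.0 / (1.0 + math.exp(-(alpha - beta * bid)))

def wtp_quantile(p, alpha, beta):
    """p-th quantile of WTP, where WTP = (alpha + eps) / beta."""
    return (alpha + math.log(p / (1.0 - p))) / beta

# Hypothetical parameters, not taken from the paper.
alpha, beta = 2.0, 0.05

median_wtp = wtp_quantile(0.5, alpha, beta)   # equals alpha / beta
lower_quartile = wtp_quantile(0.25, alpha, beta)

# At the median WTP, half of respondents are expected to accept the bid.
p_at_median = prob_yes(median_wtp, alpha, beta)
```

Under these assumptions the whole WTP distribution is fixed by two parameters, which is the kind of restriction the abstract argues a Bayesian deep learning model can relax.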
Full Text Available
Automated Construction of Lexicons to Improve Depression Screening with Text Messages

https://doi.org/10.1109/JBHI.2022.3203345

Tlachac, ML; Shrestha, Avantika; Shah, Mahum; Litterer, Benjamin; Rundensteiner, Elke A. (August 2022, IEEE Journal of Biomedical and Health Informatics)

Given that depression is one of the most prevalent mental illnesses, developing effective and unobtrusive diagnosis tools is of great importance. Recent work that screens for depression with text messages leverages models relying on lexical category features. Given the colloquial nature of text messages, the performance of these models may be limited by formal lexicons. We thus propose a strategy to automatically construct alternative lexicons that contain more relevant and colloquial terms. Specifically, we generate 36 lexicons from fiction, forum, and news corpuses. These lexicons are then used to extract lexical category features from the text messages. We utilize machine learning models to compare the depression screening capabilities of these lexical category features. Out of our 36 constructed lexicons, 14 achieved statistically significantly higher average F1 scores than the pre-existing formal lexicon and a basic bag-of-words approach. In comparison to the pre-existing lexicon, our best performing lexicon increased the average F1 scores by 10%. We thus confirm our hypothesis that less formal lexicons can improve the performance of classification models that screen for depression with text messages. By providing our automatically constructed lexicons, we aid future machine learning research that leverages less formal text.
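As an illustration of what extracting lexical category features can look like, here is a minimal sketch that assumes a lexicon is a mapping from category names to sets of words and that a feature is the share of tokens falling in each category. The function name, the tokenizer, and the toy lexicon are hypothetical and are not the authors' lexicons or pipeline.

```python
import re
from collections import Counter

def lexical_category_features(text, lexicon):
    """Return, per category, the fraction of tokens in `text` found in that category."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens) or 1  # avoid division by zero on empty messages
    counts = Counter()
    for token in tokens:
        for category, words in lexicon.items():
            if token in words:
                counts[category] += 1
    return {category: counts[category] / total for category in lexicon}

# Toy lexicon with colloquial terms, invented for this example.
toy_lexicon = {
    "sadness": {"sad", "tired", "ugh"},
    "social": {"friend", "party", "lol"},
}

features = lexical_category_features("ugh so tired, skipping the party lol", toy_lexicon)
```

The resulting dictionary can be turned into a fixed-length feature vector for a classifier. A lexicon that includes informal terms such as "ugh" or "lol" matches more tokens in text messages than a formal lexicon would.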
Full Text Available
Text Generation to Aid Depression Detection: A Comparative Study of Conditional Sequence Generative Adversarial Networks

https://doi.org/10.1109/BigData55660.2022.10020224

Tlachac, ML; Gerych, Walter; Agrawal, Kratika; Litterer, Benjamin; Jurovich, Nicholas; Thatigotla, Saitheeraj; Thadajarassiri, Jidapa; Rundensteiner, Elke A. (December 2022, 2022 IEEE International Conference on Big Data (Big Data))

Corpuses of unstructured textual data, such as text messages between individuals, are often predictive of medical issues such as depression. The text data usually used in healthcare applications has high value and great variety, but is typically small in volume. Generating labeled unstructured text data is important to improve models by augmenting these small datasets, as well as to facilitate anonymization. While methods for labeled data generation exist, not all of them generalize well to small datasets. In this work, we thus perform a much-needed systematic comparison of conditional text generation models that are promising for small datasets due to their unified architectures. We identify and implement a family of nine conditional sequence generative adversarial networks for text generation, which we collectively refer to as cSeqGAN models. These models are characterized along two orthogonal design dimensions: weighting strategies and feedback mechanisms. We conduct a comparative study evaluating the generation ability of the nine cSeqGAN models on three diverse text datasets with depression and sentiment labels. To assess the quality and realism of the generated text, we use standard machine learning metrics as well as human assessment via a user study. While the unconditioned models produced predictive text, the cSeqGAN models produced more realistic text. Our comparative study lays a solid foundation and provides important insights for further text generation research, particularly for the small datasets common within the healthcare domain.
Full Text Available
